Zhao Jin

SPA-Cache: Singular Proxies for Adaptive Caching in Diffusion Language Models

Jan 30, 2026

RoboMIND 2.0: A Multimodal, Bimanual Mobile Manipulation Dataset for Generalizable Embodied Intelligence

Dec 31, 2025

AD-FM: Multimodal LLMs for Anomaly Detection via Multi-Stage Reasoning and Fine-Grained Reward Optimization

Aug 06, 2025

FAF: A Feature-Adaptive Framework for Few-Shot Time Series Forecasting

Jun 24, 2025

Self-Supervised Multi-Part Articulated Objects Modeling via Deformable Gaussian Splatting and Progressive Primitive Segmentation

Jun 11, 2025

ArtVIP: Articulated Digital Assets of Visual Realism, Modular Interaction, and Physical Fidelity for Robot Learning

Jun 06, 2025

MLLM-Guided VLM Fine-Tuning with Joint Inference for Zero-Shot Composed Image Retrieval

May 26, 2025

VORTA: Efficient Video Diffusion via Routing Sparse Attention

May 24, 2025

Where is this coming from? Making groundedness count in the evaluation of Document VQA models

Mar 24, 2025

Correctness Learning: Deductive Verification Guided Learning for Human-AI Collaboration

Mar 10, 2025